Received: from owl.CS.Arizona.EDU by cheltenham.CS.Arizona.EDU; Mon, 20 Sep 1993 15:07:35 MST
Received: by owl.cs.arizona.edu; Mon, 20 Sep 1993 15:07:34 MST
Date: Mon, 20 Sep 93 14:57:32 -0400
From: ptho@seq1.loc.gov (Phillip Lee Thomas)
Message-Id: <9309201857.AA05326@seq1.loc.gov>
To: icon-group@cs.arizona.edu
Subject: Yet another question...
Status: R
Errors-To: icon-group-errors@cs.arizona.edu
About Chris Fagyal's text search problem...
There is too little information in the problem to give an
optimal answer. Open questions are: 1) Does every data record
consist of exactly two lines? 2) How big is the database?
Brute force: Your index lines are totally numeric, so look for
a digit in column one and see if you can convert the whole line
into a number. If you can and the number is the one you are looking
for, keep reading until the next index line.
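
A rough sketch of the brute-force scan, in Python rather than Icon. The
file layout (an all-digit index line followed by its data lines) and the
names parse_index and find_record are assumptions made for illustration,
not something taken from the original problem.

```python
import io


def parse_index(line):
    """Return the number on an index line, or None for a data line.

    Assumption: an index line has a digit in column one and the whole
    line converts to an integer.
    """
    text = line.rstrip("\n")
    if not text[:1].isdigit():
        return None
    try:
        return int(text)
    except ValueError:
        return None


def find_record(lines, target):
    """Scan from the top; return the data lines that follow `target`."""
    record = []
    found = False
    for line in lines:
        number = parse_index(line)
        if number is not None:
            if found:
                break  # reached the next index line, record is complete
            found = number == target
        elif found:
            record.append(line.rstrip("\n"))
    return record if found else None


sample = "101\nalpha\nbeta\n202\ngamma\ndelta\n"
print(find_record(io.StringIO(sample), 202))
```

This reads every line ahead of the match, so the cost grows with the
size of the file.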
Indexed: Keep a separate file of where each number is located and
seek to that location. The index file could be kept in memory for
faster access.
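
One way the indexed approach might look, again in Python and again
assuming all-digit index lines. Here the index is a dictionary held in
memory that maps each number to the byte offset of its first data line.
It could just as well be written to a separate file. build_index and
read_record are hypothetical names.

```python
import io


def build_index(f):
    """Map each index number to the byte offset just past its index line."""
    index = {}
    f.seek(0)
    while True:
        line = f.readline()
        if not line:
            break
        text = line.strip()
        if text.isdigit():
            index[int(text)] = f.tell()
    return index


def read_record(f, index, target):
    """Seek straight to the record for `target` and read its data lines."""
    if target not in index:
        return None
    f.seek(index[target])
    record = []
    while True:
        line = f.readline()
        text = line.strip()
        if not line or text.isdigit():
            break  # end of file or the next index line
        record.append(text.decode("ascii"))
    return record


# The file is opened in binary mode so that offsets are plain byte counts.
data = io.BytesIO(b"101\nalpha\nbeta\n202\ngamma\ndelta\n")
index = build_index(data)
print(read_record(data, index, 202))
```

Building the index costs one full pass, but every lookup after that is a
single seek.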
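Binary: This assumes the index numbers are in ascending order. Seek to
the middle of your file and read until you get a number line. If the
number is too big, seek halfway between the start and the current
position; if it is too small, seek halfway between the current position
and the end, etc. Be sure to test your boundaries so that you can
retrieve the first and last index lines.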
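
A sketch of the binary search over byte offsets, in Python. It assumes
all-digit index lines that appear in ascending order. The names
binary_find and read_data_lines are made up for this example. After each
seek it discards the partial line it landed in, then reads forward to
the next index line.

```python
import io


def read_data_lines(f):
    """Read data lines up to the next index line or end of file."""
    record = []
    while True:
        line = f.readline()
        text = line.strip()
        if not line or text.isdigit():
            return record
        record.append(text.decode("ascii"))


def binary_find(f, target):
    """Binary search by byte offset; the target index line starts in [lo, hi)."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        if mid > 0:
            f.seek(mid - 1)
            f.readline()  # skip ahead to the start of the next full line
        else:
            f.seek(0)
        number = None
        while f.tell() < hi:  # look for the first index line at or after mid
            text = f.readline().strip()
            if text.isdigit():
                number = int(text)
                break
        if number is None or number > target:
            hi = mid  # target, if present, starts before mid
        elif number < target:
            lo = f.tell()  # target, if present, starts after this index line
        else:
            return read_data_lines(f)
    return None


data = io.BytesIO(
    b"101\nalpha\nbeta\n202\ngamma\ndelta\n303\nepsilon\nzeta\n"
)
print(binary_find(data, 101), binary_find(data, 303))
```

The last line exercises the two boundary cases mentioned above, the
first and the last index lines.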
The main questions are whether the entire file, only the index, or
neither can be kept in memory, and whether the data lines have some
structure.
The Icon library has some procedures that might provide a model: see
idxtext and its associated procedures.
Phillip Lee Thomas
Library of Congress
ptho@seq1.loc.gov